Alphabet-Dependent String Searching with Wexponential Search Trees

Authors

  • Johannes Fischer
  • Pawel Gawrychowski
Abstract

It is widely assumed that O(m + lg σ) is the best one can do for finding a pattern of length m in a compacted trie storing strings over an alphabet of size σ, if one insists on linear-size data structures and deterministic worst-case running times [Cole et al., ICALP’06]. In this article, we first show that a rather straightforward combination of well-known ideas yields O(m + lg lg σ) deterministic worst-case searching time for static tries. Then we move on to dynamic tries, where we achieve a worst-case bound of O(m + (lg² lg σ) / (lg lg lg σ)) per query or update, which should again be compared to the previously known O(m + lg σ) deterministic worst-case bounds [Cole et al., ICALP’06], and to the alphabet-independent O(m + √(lg n / lg lg n)) deterministic worst-case bounds [Andersson and Thorup, SODA’01], where n is the number of nodes in the trie. The basis of our update procedure is a weighted variant of exponential search trees which, while simple, might be of independent interest. As one particular application, the above bounds (static and dynamic) apply to suffix trees. There, an update corresponds to pre- or appending a letter to the text, and an additional goal is to do the updates quicker than rematching entire suffixes. We show how to do this in O(lg lg n + (lg² lg σ) / (lg lg lg σ)) time, which improves the previously known O(lg n) bound [Amir et al., SPIRE’05].
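
To get a feel for how the alphabet-dependent additive terms in the abstract compare, the following minimal sketch evaluates them for a concrete alphabet size. It is only an illustration: the function name `additive_terms` is hypothetical, constant factors hidden by the O-notation are ignored, and lg² lg σ is read as (lg lg σ)², which is the usual convention but an assumption here.

```python
import math


def additive_terms(sigma: int) -> dict:
    """Evaluate the alphabet-dependent terms from the abstract (constants ignored).

    Hypothetical helper for illustration only; it is not code from the paper.
    Assumes lg is the base-2 logarithm and lg^2 x means (lg x)^2.
    """
    if sigma <= 4:
        # lg lg lg sigma must be positive for the dynamic term to be defined.
        raise ValueError("sigma must be greater than 4")
    lg_sigma = math.log2(sigma)          # term in the O(m + lg sigma) bound
    lglg_sigma = math.log2(lg_sigma)     # term in the static O(m + lg lg sigma) bound
    lglglg_sigma = math.log2(lglg_sigma)
    return {
        "lg sigma": lg_sigma,
        "lg lg sigma": lglg_sigma,
        "lg^2 lg sigma / lg lg lg sigma": lglg_sigma ** 2 / lglglg_sigma,
    }


if __name__ == "__main__":
    for name, value in additive_terms(2 ** 16).items():
        print(f"{name}: {value}")
```

For σ = 2^16 the three terms evaluate to 16, 4, and 8 respectively, so under these assumptions both new bounds have a smaller additive term than lg σ at that alphabet size.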

Similar resources

Space Efficient Suffix Trees

We give the first representation of a suffix tree that uses n lg n + O(n) bits of space and supports searching for a pattern string in the given text (from a fixed size alphabet) in O(m) time, where n is the size of the text and m is the length of the pattern. The structure is quite simple and answers a question raised by Muthukrishnan in [22]. Previous compact representations of suffix trees had either...

On-Line Approximate String Searching Algorithms: Survey and Experimental Results

The problem of approximate string searching comprises two classes of problems: string searching with k mismatches and string searching with k differences. In this paper we present a short survey and experimental results for well known sequential approximate string searching algorithms. We consider algorithms based on different approaches including dynamic programming, deterministic finite autom...

String Searching over Small Alphabets

We propose a new string searching procedure inspired by the Boyer–Moore algorithm. The two key ideas of our improvement are to keep track of all the previously matched characters within the current alignment and not to move the reading position unconditionally to the end of the pattern when a mismatch occurs. The result is an algorithm with increased average shift amounts and a guarantee that a...

Inferring Strings from Suffix Trees and Links on a Binary Alphabet

A suffix tree, which provides us with a linear space full-text index of a given string, is a fundamental data structure for string processing and information retrieval. In this paper we consider the reverse engineering problem on suffix trees: Given an unlabeled ordered rooted tree T accompanied with a node-to-node transition function f, infer a string whose suffix tree and its suffix links fo...

Self-indexing Natural Language

Self-indexing is a concept developed for indexing arbitrary strings. It has been enormously successful in reducing the size of the large indexes typically used on strings, namely suffix trees and arrays. Self-indexes represent a string in a space close to its compressed size and provide indexed searching on it. On natural language, a compressed inverted index over the compressed text already provi...

Publication date: 2015